AITopics

2606.29835

Country: North America > United States > Pennsylvania (0.14)

Genre: Research Report (0.50)

Industry:

Government > Regional Government > North America Government > United States Government (0.89)
Information Technology > Security & Privacy (0.67)

Technology:

Information Technology > Data Science > Data Quality > Data Transformation (0.68)
Information Technology > Security & Privacy (0.67)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.46)
Information Technology > Artificial Intelligence > Machine Learning (0.46)

Neural Information Processing Systems, Jun-19-2026, 12:36:55 GMT

Adaptive Data Analysis for Growing Data

Reuse of data in adaptive workflows poses challenges regarding overfitting and the statistical validity of results. Previous work has demonstrated that interacting with data via differentially private algorithms can mitigate overfitting, achieving worst-case generalization guarantees with asymptotically optimal data requirements. However, such past work assumes that data is static and cannot accommodate situations where data grows over time. In this paper we address this gap, presenting the first generalization bounds for adaptive analysis on dynamic data. We allow the analyst to adaptively schedule their queries conditioned on the current size of the data, in addition to previous queries and responses. We also incorporate time-varying empirical accuracy bounds and mechanisms, allowing for tighter guarantees as data accumulates. In a batched query setting, the asymptotic data requirements of our bound grow with the square root of the number of adaptive queries, matching prior works' improvement over data splitting for the static setting. We instantiate our bound for statistical queries with the clipped Gaussian mechanism, where it empirically outperforms baselines composed from static bounds.
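As a rough illustration of the last sentence of the abstract, the sketch below answers a statistical query (the mean of a [0, 1]-valued function over the data) with Gaussian noise and clips the result back to [0, 1], while the dataset grows between queries. The function names, the `c / sqrt(n)` noise schedule, and the sample values are assumptions made for illustration; the abstract does not give the paper's calibration.

```python
import math
import random

def clipped_gaussian_sq(data, query, sigma, rng=random):
    """Answer a statistical query with Gaussian noise, then clip to [0, 1].

    `query` maps one record to a value in [0, 1]. `sigma` is a free noise
    scale here, not a privacy-calibrated value from the paper.
    """
    empirical_mean = sum(query(x) for x in data) / len(data)
    noisy = empirical_mean + rng.gauss(0.0, sigma)
    return min(1.0, max(0.0, noisy))

def sigma_for_size(n, c=1.0):
    """Hypothetical schedule: the noise scale shrinks as the data grows."""
    return c / math.sqrt(n)

rng = random.Random(0)
data = []
for batch in ([0.2, 0.9], [0.4, 0.7], [0.1, 0.6, 0.8]):
    data.extend(batch)  # the dataset grows between queries
    answer = clipped_gaussian_sq(data, lambda x: x, sigma_for_size(len(data)), rng)
    print(len(data), round(answer, 3))
```

The loop only shows the time-varying aspect (a different noise scale at each data size); how the analyst schedules queries adaptively is left out.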

artificial intelligence, machine learning, natural language, (16 more...)

Genre: Research Report > Experimental Study (1.00)

Industry: Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Neural Information Processing Systems, Jun-19-2026, 12:32:09 GMT

Privacy amplification by random allocation

We consider the privacy amplification properties of a sampling scheme in which a user's data is used in k steps chosen randomly and uniformly from a sequence (or set) of t steps. This sampling scheme has been recently applied in the context of differentially private optimization [Chua et al., 2024a, Choquette-Choo et al., 2025] and is also motivated by communication-efficient high-dimensional private aggregation [Asi et al., 2025]. Existing analyses of this scheme either rely on privacy amplification by shuffling, which leads to overly conservative bounds, or require Monte Carlo simulations that are computationally prohibitive in most practical scenarios. We give the first theoretical guarantees and numerical estimation algorithms for this sampling scheme. In particular, we demonstrate that the privacy guarantees of random k-out-of-t allocation can be upper bounded by the privacy guarantees of the well-studied independent (or Poisson) subsampling in which each step uses the user's data with probability (1+o(1))k/t. Further, we provide two additional analysis techniques that lead to numerical improvements in several parameter regimes. Altogether, our bounds give efficiently computable and nearly tight numerical results for random allocation applied to Gaussian noise addition.
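The two sampling schemes compared in this abstract can be sketched as follows. The function names are hypothetical, and the Poisson rate is set to exactly k/t for simplicity, which ignores the (1+o(1)) factor in the abstract's comparison; no privacy accounting is performed here.

```python
import random

def random_allocation(t, k, rng=random):
    """Use the data in exactly k of the t steps, chosen uniformly at random."""
    return sorted(rng.sample(range(t), k))

def poisson_subsampling(t, q, rng=random):
    """Use the data in each of the t steps independently with probability q."""
    return [step for step in range(t) if rng.random() < q]

rng = random.Random(0)
t, k = 1000, 10
alloc = random_allocation(t, k, rng)
poisson = poisson_subsampling(t, k / t, rng)

print(len(alloc))    # always exactly k participations
print(len(poisson))  # random; k only in expectation
```

The difference the sketch makes visible is that random allocation fixes the number of participations, whereas Poisson subsampling fixes only its expectation.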

allocation, artificial intelligence, machine learning, (15 more...)

Country:

Europe (0.45)
North America > United States (0.28)
Asia (0.27)

Genre: Research Report > Experimental Study (1.00)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Neural Information Processing Systems, Jun-16-2026, 18:24:35 GMT

Differential Privacy for Euclidean Jordan Algebra with Applications to Private Symmetric Cone Programming

In this paper, we study differentially private mechanisms for functions whose outputs lie in a Euclidean Jordan algebra. Euclidean Jordan algebras capture many important mathematical structures and form the foundation of linear programming, second-order cone programming, and semidefinite programming. Our main contribution is a generic Gaussian mechanism for such functions, with sensitivity measured in ℓ2, ℓ1, and ℓ∞ norms. Notably, this framework includes the important case where the function outputs are symmetric matrices, and sensitivity is measured in the Frobenius, nuclear, or spectral norm. We further derive private algorithms for solving symmetric cone programs under various settings, using a combination of the multiplicative weights update method and our generic Gaussian mechanism. As an application, we present differentially private algorithms for semidefinite programming, resolving a major open question posed by [Hsu, Roth, Roughgarden, and Ullman, ICALP 2014].

artificial intelligence, machine learning, natural language, (20 more...)

Country:

North America > United States (0.92)
Asia > Middle East > Jordan (0.82)

Genre: Research Report > Experimental Study (1.00)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(2 more...)

Liu, Huikang, Selvi, Aras, Wiesemann, Wolfram

Mind the Gap: Mixtures of Gaussians in Approximate Differential Privacy

arXiv.org Machine Learning, May-28-2026

We design a class of additive noise mechanisms that satisfy $(\varepsilon, \delta)$-differential privacy (DP) for scalar, real-valued query functions with known sensitivities, with a particular focus on moderate and low-privacy regimes. These mechanisms, which we call mixture mechanisms, are constructed by mixing multiple Gaussian distributions that share the same variance but differ in their means and mixture weights. The resulting distributions can be interpreted as convex combinations of a zero-mean Gaussian (as used in the analytic Gaussian mechanism) and additional Gaussians whose means depend on the sensitivity of the query function. We derive tight conditions on the variances required for $(\varepsilon, \delta)$-DP and provide efficient algorithms to compute them. Compared to the analytic Gaussian mechanism, our mechanisms yield substantially lower expected noise amplitudes ($\ell_1$-loss) and variances ($\ell_2$-loss for zero-mean distributions). In the low-privacy regime that motivates our design, our mechanisms approach optimality, mitigating nearly all of the optimality gap of the analytic Gaussian mechanism.
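The construction described above (Gaussians with a shared variance, different means, and mixture weights) can be sampled as in the sketch below. All names and numbers are assumptions for illustration: the abstract does not state the means, weights, or the variance condition, so the values used here are placeholders and are not calibrated to any $(\varepsilon, \delta)$ guarantee.

```python
import random

def sample_mixture_noise(weights, means, sigma, rng=random):
    """Draw one value from a Gaussian mixture with a shared standard deviation."""
    if len(weights) != len(means):
        raise ValueError("weights and means must have the same length")
    # Pick a component by its weight, then sample from that Gaussian.
    mean = rng.choices(means, weights=weights, k=1)[0]
    return rng.gauss(mean, sigma)

def mixture_mechanism(true_answer, weights, means, sigma, rng=random):
    """Additive-noise release of a scalar query answer."""
    return true_answer + sample_mixture_noise(weights, means, sigma, rng)

# Placeholder parameters (NOT privacy-calibrated): a zero-mean component
# plus two components shifted by a hypothetical sensitivity of 1.0.
sensitivity = 1.0
weights = [0.8, 0.1, 0.1]
means = [0.0, -sensitivity, sensitivity]
rng = random.Random(0)
print(mixture_mechanism(42.0, weights, means, sigma=0.5, rng=rng))
```

Choosing `sigma`, the weights, and the means so that the release actually satisfies DP is the part the paper addresses and is not reproduced here.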

artificial intelligence, machine learning, mechanism, (17 more...)

2605.28078

Genre: Research Report > New Finding (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine (0.67)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Data Science (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)

Lampert, Christoph H., Zakerinia, Hossein

From Privacy to Generalization: Linear Max-Information Bounds for DP-SGD

arXiv.org Machine Learning, May-27-2026

Understanding the relationship between generalization and privacy remains a central challenge in modern machine learning theory, particularly for deep networks trained by variants of differentially private stochastic gradient descent (DP-SGD). In this work we make progress on this persistent open problem by proving a finite-sample bound on the approximate max-information of DP-SGD that exhibits scaling properties comparable to the classic result of Dwork et al. (2015) for $\varepsilon$-differentially private algorithms, namely at most linear in the dataset size. From our result we obtain a general-purpose PAC-Bayes generalization bound in which the necessary prior distribution can be learned by DP-SGD, as well as a generalization bound for DP-SGD-trained models themselves, with a complexity term that is fully explicit and controlled by the optimization hyperparameters.

artificial intelligence, dp-sgd, machine learning, (15 more...)

2605.26222

Country:

Europe (0.28)
North America (0.28)

Genre: Research Report (0.70)

Industry: Information Technology > Security & Privacy (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.55)

Ferrando, Cecilia, Fuentes, Miguel, Mullins, Brett, Musco, Cameron, Sheldon, Daniel

Private Adaptive Covariance Estimation via Gaussian Graphical Models

arXiv.org Machine Learning, May-26-2026

We propose PACE-GGM, a data-adaptive differentially private method for covariance estimation that concentrates its privacy budget on the most informative entries of the empirical covariance matrix, rather than perturbing all entries. This applies in the natural setting where the modeler supplies separate bounds for each variable, so that individual entries can be measured with less noise than the full matrix. In each round, our method selects a poorly approximated entry, measures it using the Gaussian mechanism, and then reconstructs a full covariance matrix using a maximum-entropy reconstruction objective, leading to a Gaussian graphical model structure. Experiments on diverse real-world datasets demonstrate consistent improvements in estimation error with respect to the Gaussian mechanism and other baselines, particularly in high-dimensional and low-to-moderate privacy regimes.

artificial intelligence, machine learning, matrix, (15 more...)

2605.24295

Country: North America > United States (0.46)

Genre: Research Report (0.82)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Neural Information Processing Systems, Apr-25-2026, 11:39:18 GMT

B Memorization: Formal treatment

Tables 3 and 4 summarize hyperparameters for PATE-FM and ALIBI, respectively. Table 3: PATE-FM (Algorithms 1 and 2) hyperparameters for select accuracy levels. To empirically bound the level ε of DP, prior work instantiates a general membership inference game, defined in Figure 2 for two arbitrary neighboring datasets D0 and D1. By repeating this game multiple times, we can estimate the adversary's success rate and convert it into a lower bound on ε. This would be prohibitively expensive in our setting: each iteration of the game requires training a model on CIFAR-10 or CIFAR-100, and the game has to be repeated about 1,000 times to get non-trivial bounds.
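The conversion from an adversary's success rate to a lower bound on ε can be sketched as below. This is a simplified point-estimate version under stated assumptions: it uses the standard implication that an (ε, δ)-DP mechanism forces TPR ≤ e^ε · FPR + δ and 1 − FPR ≤ e^ε · (1 − TPR) + δ for any distinguishing adversary, and it omits the confidence intervals a real audit would need. The function names and the trial format are hypothetical.

```python
import math

def rates_from_trials(trials):
    """Estimate (TPR, FPR) from (true_bit, guessed_bit) pairs.

    true_bit says which dataset (D0 -> 0, D1 -> 1) was used in a game round;
    guessed_bit is the adversary's guess for that round.
    """
    pos = [g for b, g in trials if b == 1]
    neg = [g for b, g in trials if b == 0]
    return sum(pos) / len(pos), sum(neg) / len(neg)

def epsilon_lower_bound(tpr, fpr, delta=0.0):
    """Point-estimate lower bound on epsilon from the adversary's error rates."""
    candidates = [0.0]
    # Cases with a zero denominator are skipped; that only makes the
    # returned bound more conservative (smaller).
    if fpr > 0 and tpr - delta > 0:
        candidates.append(math.log((tpr - delta) / fpr))
    if tpr < 1 and 1 - fpr - delta > 0:
        candidates.append(math.log((1 - fpr - delta) / (1 - tpr)))
    return max(candidates)

# Toy game record: 4 rounds with D1 and 4 rounds with D0.
trials = [(1, 1), (1, 1), (1, 1), (1, 0), (0, 0), (0, 0), (0, 0), (0, 1)]
tpr, fpr = rates_from_trials(trials)
print(tpr, fpr, epsilon_lower_bound(tpr, fpr))
```

With only a handful of rounds the point estimates are too noisy to support a meaningful bound, which is consistent with the passage's remark that the game must be repeated many times.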

adversary, artificial intelligence, machine learning, (16 more...)

Genre: Research Report > New Finding (0.30)

Industry: Leisure & Entertainment > Games (0.35)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)